@camilamacedo86
Contributor

@camilamacedo86 camilamacedo86 commented Nov 1, 2025

Fix install errors leaking into deprecation conditions

Fixes deprecation conditions incorrectly showing install/validation errors instead of actual deprecation status.

Closes #2008

What was broken

When bundle installation or validation failed, the error message was getting copied to all conditions including the deprecation ones. This made deprecation status useless and confusing.

Example from the bug report - when a bundle failed validation:

- type: Progressing
  status: "True"
  reason: Retrying
  message: "validating bundle: has a dependency declared via property olm.package.required which is currently not supported"

- type: Deprecated
  status: "False"
  reason: Failed
  message: "validating bundle: has a dependency declared via property olm.package.required which is currently not supported"  # ❌ Wrong!

- type: PackageDeprecated
  status: "False"
  reason: Failed
  message: "validating bundle: has a dependency declared via property olm.package.required which is currently not supported"  # ❌ Wrong!

- type: BundleDeprecated
  status: "False"
  reason: Failed
  message: "validating bundle: has a dependency declared via property olm.package.required which is currently not supported"  # ❌ Wrong!

The deprecation conditions should show deprecation information from the catalog, not installation errors.

While fixing this, we also discovered and fixed several other related issues:

  • Deprecation showed False when catalog was removed (should be Unknown)
  • BundleDeprecated showed the resolved bundle instead of the installed bundle
  • Boxcutter rollouts showed False for deprecation even though no catalog was checked

Key Differences (What This PR Fixes)

| Aspect | Main | This PR |
| --- | --- | --- |
| Channel filtering | ✅ Same | ✅ Same |
| Catalog selection | ✅ Same (resolver picks one) | ✅ Same |
| When deprecation set | Only after successful resolution | After resolution, after install, during rollout |
| Catalog unavailable | Shows False ❌ (wrong!) | Shows Unknown |
| Resolution fails | Deprecation lost ❌ | Deprecation shown ✅ |
| Bundle reference | Uses resolved bundle ❌ | Uses installed bundle ✅ |
| Install errors | Leak into deprecation conditions ❌ | Stay in Progressing/Installed ✅ |

What you'll see now

Example 1: The original bug - validation failure

Before this fix:

- type: PackageDeprecated
  status: "False"
  reason: Failed
  message: "validating bundle: has a dependency declared via property olm.package.required which is currently not supported"  # ❌ Wrong!

After this fix:

- type: PackageDeprecated
  status: "False"      # ✅ Shows actual deprecation status from catalog
  reason: Deprecated
  message: ""          # ✅ No install error here!

Install errors only appear in Progressing and Installed conditions.
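
The separation described above can be sketched in Go as follows. This is a minimal illustration, not the controller's actual code: the `Condition` type, the `buildConditions` helper, the sample error, and the `Succeeded` reason are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Condition is a simplified, hypothetical stand-in for a Kubernetes status condition.
type Condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// errValidation is a sample install/validation error used by the demo.
var errValidation = errors.New("validating bundle: unsupported dependency")

// buildConditions puts the install error only into Progressing and derives
// PackageDeprecated solely from the catalog's deprecation message.
func buildConditions(installErr error, catalogDeprecationMsg string) []Condition {
	// "Succeeded" is an assumed reason for the no-error case.
	progressing := Condition{Type: "Progressing", Status: "True", Reason: "Succeeded"}
	if installErr != nil {
		progressing.Reason = "Retrying"
		progressing.Message = installErr.Error()
	}

	pkgDeprecated := Condition{Type: "PackageDeprecated", Status: "False", Reason: "Deprecated"}
	if catalogDeprecationMsg != "" {
		pkgDeprecated.Status = "True"
		pkgDeprecated.Message = catalogDeprecationMsg
	}
	return []Condition{progressing, pkgDeprecated}
}

func main() {
	for _, c := range buildConditions(errValidation, "") {
		fmt.Printf("%s status=%s reason=%s message=%q\n", c.Type, c.Status, c.Reason, c.Message)
	}
}
```

The key design point is that the deprecation condition never reads `installErr`, so there is no path by which an install failure can reach it.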

Example 2: Catalog gets removed

Before this fix:

# Bundle v1.0.0 installed, then catalog deleted
- type: BundleDeprecated
  status: "False"      # ❌ Claims "not deprecated" when we can't check
  reason: Absent       # ❌ Wrong - bundle exists!

After this fix:

# Bundle v1.0.0 installed, then catalog deleted
- type: BundleDeprecated
  status: "Unknown"    # ✅ Honest - we can't check
  reason: Deprecated   # ✅ Bundle exists, catalog unavailable
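
The three-valued status in this example can be sketched as a small Go function. The `deprecationStatus` name and its boolean inputs are hypothetical; the point is only that "catalog unavailable" must be checked before anything is reported as not deprecated.

```go
package main

import "fmt"

// deprecationStatus returns a condition status string. When the catalog
// cannot be consulted, the answer is "Unknown" rather than "False".
func deprecationStatus(catalogAvailable, deprecated bool) string {
	if !catalogAvailable {
		return "Unknown"
	}
	if deprecated {
		return "True"
	}
	return "False"
}

func main() {
	fmt.Println("catalog deleted:", deprecationStatus(false, false))
	fmt.Println("catalog present, not deprecated:", deprecationStatus(true, false))
	fmt.Println("catalog present, deprecated:", deprecationStatus(true, true))
}
```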

Example 3: Upgrading v1→v2

Before this fix:

# v1.0.0 running, resolved v1.0.1 (not installed yet)
- type: BundleDeprecated
  status: "True"
  message: "Bundle v1.0.1 is deprecated"  # ❌ Shows v1.0.1 before it's installed!

After this fix:

# v1.0.0 running, resolved v1.0.1 (not installed yet)
- type: BundleDeprecated
  status: "False"      # ✅ Shows v1.0.0's status (what's actually running)
  message: ""
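
A rough Go sketch of the "installed, not resolved" rule follows. The `Bundle` type, the `bundleDeprecatedStatus` helper, the bundle names, and the map of deprecation messages keyed by bundle name are assumptions for illustration only.

```go
package main

import "fmt"

// Bundle is a hypothetical, minimal bundle reference.
type Bundle struct {
	Name string
}

// bundleDeprecatedStatus reports on the installed bundle only. The resolved
// bundle is accepted as a parameter to show that it is deliberately ignored.
func bundleDeprecatedStatus(installed, resolved *Bundle, deprecations map[string]string) (status, message string) {
	_ = resolved // not running yet, so it must not drive the condition
	if installed == nil {
		return "Unknown", "" // nothing installed, nothing to report on
	}
	if msg, ok := deprecations[installed.Name]; ok {
		return "True", msg
	}
	return "False", ""
}

func main() {
	// Only the upgrade target is deprecated in this hypothetical catalog data.
	deprecations := map[string]string{"foo.v1.0.1": "Bundle v1.0.1 is deprecated"}
	status, msg := bundleDeprecatedStatus(&Bundle{Name: "foo.v1.0.0"}, &Bundle{Name: "foo.v1.0.1"}, deprecations)
	fmt.Printf("status=%s message=%q\n", status, msg)
}
```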

Example 4: Resolution fails but package is deprecated

Before this fix:

# No bundles match version constraint, but package IS deprecated in catalog
- type: PackageDeprecated
  status: "False"      # ❌ Warning lost!

After this fix:

# No bundles match version constraint, but package IS deprecated in catalog
- type: PackageDeprecated
  status: "True"       # ✅ Warning shown even though resolution failed
  message: "Package foo is deprecated, please migrate to bar"
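
One way to picture this behavior in Go: package-level deprecation is looked up directly in catalog data, independently of the resolver's outcome. The `resolve` stub, the `packageDeprecated` helper, and the map shape are hypothetical and exist only for this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

var errNoMatch = errors.New("no bundles match version constraint")

// resolve is a hypothetical resolver stub that always fails, standing in for
// a version constraint that no bundle satisfies.
func resolve(pkg string) (string, error) { return "", errNoMatch }

// packageDeprecated reads package-level deprecation straight from catalog
// data, so the result does not depend on whether resolution succeeded.
func packageDeprecated(pkg string, catalogPackageDeprecations map[string]string) (status, message string) {
	if msg, ok := catalogPackageDeprecations[pkg]; ok {
		return "True", msg
	}
	return "False", ""
}

func main() {
	catalog := map[string]string{"foo": "Package foo is deprecated, please migrate to bar"}
	_, err := resolve("foo")
	status, msg := packageDeprecated("foo", catalog)
	fmt.Printf("resolveErr=%v status=%s message=%q\n", err, status, msg)
}
```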

Behavior summary

| Scenario | PackageDeprecated | ChannelDeprecated | BundleDeprecated |
| --- | --- | --- | --- |
| Working, not deprecated | False | False | False |
| Working, is deprecated | True | True | True |
| Install/validation error | True/False (from catalog) | True/False (from catalog) | Unknown/Absent |
| Catalog removed | Unknown | Unknown | Unknown |
| Boxcutter rollout (no catalog check) | Unknown | Unknown | Unknown |
| Upgrading v1→v2 (v2 not installed yet) | From catalog | From catalog | Shows v1 |

Copilot AI review requested due to automatic review settings November 1, 2025 07:58
@camilamacedo86 camilamacedo86 requested a review from a team as a code owner November 1, 2025 07:58
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 1, 2025
@openshift-ci openshift-ci bot requested a review from dtfranz November 1, 2025 07:58
netlify bot commented Nov 1, 2025

Deploy Preview for olmv1 ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | d84278c |
| 🔍 Latest deploy log | https://app.netlify.com/projects/olmv1/deploys/691b164f2fd32c0008ae79a3 |
| 😎 Deploy Preview | https://deploy-preview-2296--olmv1.netlify.app |

@openshift-ci openshift-ci bot requested a review from oceanc80 November 1, 2025 07:58
Copilot AI left a comment

Pull Request Overview

This PR refactors deprecation status handling in ClusterExtension reconciliation to ensure deprecation conditions accurately reflect catalog data and installed bundle state, preventing install/validation errors from leaking into deprecation conditions.

  • Moved deprecation status updates to a deferred function that runs at the end of reconciliation
  • Changed BundleDeprecated condition to use Unknown status with ReasonAbsent when no bundle is installed
  • Updated test expectations to handle multiple possible error messages for sourceType validation
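
The deferred-update approach mentioned in the first bullet can be sketched with Go's `defer`. This is an illustration under assumptions, not the PR's code: the `status` type, the `reconcile` signature, and the `failInstall` switch are invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// status holds just the two fields this sketch needs (hypothetical type).
type status struct {
	Progressing string // message for the Progressing condition
	Deprecated  string // status for the Deprecated condition
}

var errInstall = errors.New("install failed")

// reconcile sketches the deferred pattern: whichever path returns, the
// deferred function sets the deprecation status from catalog data only.
func reconcile(st *status, catalogDeprecated, failInstall bool) error {
	defer func() {
		// Runs on every return path, including the early error return below.
		if catalogDeprecated {
			st.Deprecated = "True"
		} else {
			st.Deprecated = "False"
		}
	}()

	if failInstall {
		st.Progressing = errInstall.Error()
		return errInstall
	}
	st.Progressing = "ok"
	return nil
}

func main() {
	st := &status{}
	err := reconcile(st, true, true)
	fmt.Printf("err=%v progressing=%q deprecated=%s\n", err, st.Progressing, st.Deprecated)
}
```

Because the deferred function never looks at the returned error, an install failure cannot overwrite the deprecation status.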

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| internal/operator-controller/controllers/clusterextension_controller.go | Refactored deprecation status logic with deferred updates and new condition semantics for uninstalled bundles |
| internal/operator-controller/controllers/clusterextension_controller_test.go | Added tests for deprecation handling with resolution failures and applier failures; updated test expectations for BundleDeprecated conditions |
| test/e2e/cluster_extension_install_test.go | Updated e2e tests to verify deprecation conditions in success and failure scenarios |
| internal/operator-controller/controllers/clusterextension_admission_test.go | Enhanced sourceType validation test to handle multiple possible error message formats |

Copilot AI review requested due to automatic review settings November 1, 2025 08:33
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


codecov bot commented Nov 1, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.41%. Comparing base (c06f27f) to head (d84278c).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2296      +/-   ##
==========================================
+ Coverage   74.30%   74.41%   +0.11%     
==========================================
  Files          91       91              
  Lines        7083     7118      +35     
==========================================
+ Hits         5263     5297      +34     
- Misses       1405     1407       +2     
+ Partials      415      414       -1     
| Flag | Coverage Δ |
| --- | --- |
| e2e | 45.81% <81.05%> (+0.17%) ⬆️ |
| experimental-e2e | 48.63% <85.26%> (+0.23%) ⬆️ |
| unit | 58.75% <97.89%> (+0.14%) ⬆️ |


@camilamacedo86 camilamacedo86 changed the title from "WIP: 🐛 (fix):Fix deprecation conditions leaking install errors" to "🐛 (fix):Fix deprecation conditions leaking install errors" Nov 1, 2025
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 1, 2025
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

internal/operator-controller/controllers/clusterextension_controller_test.go:1

  • The assertion require.Contains(ct, []string{...}, cond.Reason) is semantically backwards. The Contains function expects a slice as the second argument and the search item as the first. This should be require.Contains(ct, cond.Reason, []string{ocv1.ReasonFailed, ocv1.ReasonAbsent}) or use require.True(ct, slices.Contains([]string{ocv1.ReasonFailed, ocv1.ReasonAbsent}, cond.Reason)).
package controllers_test
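
For reference when weighing this suppressed comment: testify's `Contains(t, s, contains)` takes the container first and the element second, so a call of the form `require.Contains(ct, []string{...}, cond.Reason)` already matches that signature. The standard-library alternative the comment mentions, `slices.Contains`, uses the same container-first order. A dependency-free sketch follows; the lowercase constants are stand-ins for `ocv1.ReasonFailed` and `ocv1.ReasonAbsent`, and their string values are assumed from the YAML shown earlier.

```go
package main

import (
	"fmt"
	"slices"
)

// Stand-ins for ocv1.ReasonFailed and ocv1.ReasonAbsent (values assumed).
const (
	reasonFailed = "Failed"
	reasonAbsent = "Absent"
)

// reasonAllowed reports whether reason is one of the accepted values.
// slices.Contains takes the slice first and the element second.
func reasonAllowed(reason string) bool {
	return slices.Contains([]string{reasonFailed, reasonAbsent}, reason)
}

func main() {
	fmt.Println(reasonAllowed("Absent"))
	fmt.Println(reasonAllowed("Deprecated"))
}
```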

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.


Copilot AI left a comment

Pull Request Overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.


Copilot AI left a comment

Pull Request Overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated no new comments.


Copilot AI review requested due to automatic review settings November 11, 2025 11:02
Copilot finished reviewing on behalf of camilamacedo86 November 11, 2025 11:04
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated no new comments.


Copilot AI left a comment

Pull Request Overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated no new comments.


@joelanford
Member

Not sure if this has been addressed in other comments (if so, please update the PR description!).

I noticed that we have Reason: Deprecated even when that status is Unknown or False. That should be fixed too.

  • False should have Reason: NotDeprecated (or Current, Active, etc.)
  • Unknown should perhaps have something about why it is unknown? (e.g. DeprecationMetadataUnavailable or something?)
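
The suggestion above amounts to choosing the reason from the status. A minimal Go sketch, assuming the reason strings floated in this comment (`NotDeprecated`, `DeprecationMetadataUnavailable`); these are proposals from the discussion, not constants known to exist in the API, and `reasonFor` is a hypothetical helper.

```go
package main

import "fmt"

// reasonFor maps a condition status to a reason that matches it, so that
// "Deprecated" is only ever paired with status "True".
func reasonFor(status string) string {
	switch status {
	case "True":
		return "Deprecated"
	case "False":
		return "NotDeprecated" // proposed name, not confirmed
	default:
		return "DeprecationMetadataUnavailable" // proposed name, not confirmed
	}
}

func main() {
	for _, s := range []string{"True", "False", "Unknown"} {
		fmt.Printf("status=%s reason=%s\n", s, reasonFor(s))
	}
}
```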

@joelanford
Member

One thing I didn't see answered in the description is the open question in the original TODO comment about what happens when the package and/or channel is present in multiple catalogs, but we didn't resolve a bundle:

  //       - Open question though: what if different catalogs have different opinions of what's deprecated.
  //         If we can't resolve a bundle, how do we know which catalog to trust for deprecation information?
  //         Perhaps if the package shows up in multiple catalogs and deprecations don't match, we can set
  //         the deprecation status to unknown? Or perhaps we somehow combine the deprecation information from
  //         all catalogs?
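
One of the options floated in the TODO (report unknown when catalogs disagree) could look roughly like the Go sketch below. It is only an illustration of that single option, not a proposal for what the controller should do; `combineCatalogOpinions` and its input shape are hypothetical.

```go
package main

import "fmt"

// combineCatalogOpinions takes one "is deprecated" opinion per catalog that
// contains the package. It returns "True" or "False" when all catalogs agree,
// and "Unknown" when they disagree or when there are no opinions at all.
func combineCatalogOpinions(opinions []bool) string {
	if len(opinions) == 0 {
		return "Unknown"
	}
	first := opinions[0]
	for _, o := range opinions[1:] {
		if o != first {
			return "Unknown"
		}
	}
	if first {
		return "True"
	}
	return "False"
}

func main() {
	fmt.Println(combineCatalogOpinions([]bool{true, true}))
	fmt.Println(combineCatalogOpinions([]bool{true, false}))
	fmt.Println(combineCatalogOpinions(nil))
}
```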

@joelanford
Member

Is there an RFC for this? It kinda feels like we should have something written up in that form? I also wonder if we should tackle the question of whether to explicitly include the Deprecation conditions when we would set them as False. They add a lot of noise to the YAML/JSON and describe output. And it feels like absence of these conditions would be reasonably interpreted as "not deprecated".
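
The "absent means not deprecated" reading can be sketched as a consumer-side lookup in Go. The `Condition` type and `isDeprecated` helper are hypothetical and simplified; the helper treats a missing condition the same way as one that is present with a status other than "True".

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for a status condition.
type Condition struct {
	Type   string
	Status string
}

// isDeprecated returns true only when the named condition is present with
// status "True". An absent condition is read as "not deprecated".
func isDeprecated(conds []Condition, condType string) bool {
	for _, c := range conds {
		if c.Type == condType {
			return c.Status == "True"
		}
	}
	return false
}

func main() {
	conds := []Condition{{Type: "PackageDeprecated", Status: "True"}}
	fmt.Println(isDeprecated(conds, "PackageDeprecated"))
	fmt.Println(isDeprecated(conds, "BundleDeprecated"))
}
```

Note that under this reading a consumer cannot distinguish "not deprecated" from "unknown" unless Unknown conditions are still written explicitly.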

Copilot AI review requested due to automatic review settings November 13, 2025 06:42
Copilot finished reviewing on behalf of camilamacedo86 November 13, 2025 06:44
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated no new comments.


@camilamacedo86
Contributor Author

camilamacedo86 commented Nov 13, 2025

Hi @joelanford

Thanks for the feedback! You're absolutely right: we should keep the TODO and the open question for item 2, and I've done that. See the updated TODO with scenarios here: internal/operator-controller/controllers/clusterextension_controller.go

The core logic for handling multiple catalogs hasn't changed: the resolver still picks one. What changed is that we now handle the failure cases correctly and no longer lose deprecation information when things go wrong. The multi-catalog question does need a follow-up PR/discussion, but it is out of scope here.

Just to clarify the scope: this PR solves item 1 (use the installed bundle instead of the resolved one) and #2008 (install errors leaking into deprecation conditions). The open question about conflicting deprecations from multiple catalogs is a separate edge case that deserves its own focused discussion and PR (IMHO, let's go step by step).

Key Differences (What This PR Fixes)

| Aspect | Main | This PR |
| --- | --- | --- |
| Channel filtering | ✅ Same | ✅ Same |
| Catalog selection | ✅ Same (resolver picks one) | ✅ Same |
| When deprecation set | Only after successful resolution | After resolution, after install, during rollout |
| Catalog unavailable | Shows False ❌ (wrong!) | Shows Unknown |
| Resolution fails | Deprecation lost ❌ | Deprecation shown ✅ |
| Bundle reference | Uses resolved bundle ❌ | Uses installed bundle ✅ |
| Install errors | Leak into deprecation conditions ❌ | Stay in Progressing/Installed ✅ |

Hope this addresses your concern! Let me know if you'd like me to clarify anything else.
I think we should tackle this in parts. Based on the above, do you still think we need an RFC before moving forward with the bug fix in this PR, or can we proceed with these changes and track the need for an RFC to discuss item 2 separately?

This comment was marked as resolved.

… and prevent error leak

When a catalog is removed or unreachable, deprecation conditions now correctly show Unknown instead of False. BundleDeprecated now reflects the installed bundle and uses the correct reason.

Assisted-by: Cursor
Successfully merging this pull request may close these issues.

Install error message getting propagated to most conditions

5 participants